Blog Classification: Adding Linguistic Knowledge to Improve the K-NN Algorithm
Authors
Abstract
Blogs are interactive, regularly updated websites that can be seen as diaries. They are composed of articles on distinct topics, so Information Retrieval approaches need to be developed for this new kind of web knowledge. The first important step of this process is the categorization of the articles. This paper compares several methods that combine linguistic knowledge with the k-NN algorithm for the automatic categorization of weblog articles.
Similar resources
Coupling K-nearest Neighbors with Logistic Regression in Case-based Reasoning
Case-based reasoning (CBR) systems use similarity functions to solve new problems with past situations. The K-nearest neighbors algorithm (K-NN) has been used in CBR systems to define the status of new cases according to the characteristics of the nearest past cases. We proposed a new hybrid approach combining logistic regression (LR) with K-NN to optimize CBR classification. First, we analyzed the knowledge dat...
A comparative study of performance of K-nearest neighbors and support vector machines for classification of groundwater
The aim of this work is to examine the feasibility of the support vector machine (SVM) and K-nearest neighbor (K-NN) classifiers for the classification of an aquifer in the Khuzestan Province, Iran. For this purpose, 17 groundwater quality variables including EC, TDS, turbidity, pH, total hardness, Ca, Mg, total alkalinity, sulfate, nitrate, nitrite, fluoride, phosphate, Fe, Mn, Cu, ...
Optimized Seizure Detection Algorithm: A Fast Approach for Onset of Epileptic in EEG Signals Using GT Discriminant Analysis and K-NN Classifier
Background: Epilepsy is a severe disorder of the central nervous system that predisposes the person to recurrent seizures. Fifty million people worldwide suffer from epilepsy; after Alzheimer’s and stroke, it is the third most widespread nervous disorder. Objective: In this paper, an algorithm to detect the onset of epileptic seizures based on the analysis of brain electrical signals (EEG) has b...
Modification of the Fast Global K-means Using a Fuzzy Relation with Application in Microarray Data Analysis
Recognizing genes with distinctive expression levels can help in the prevention, diagnosis and treatment of diseases at the genomic level. In this paper, fast global k-means (fast GKM) is developed for clustering gene expression datasets. Fast GKM is a significant improvement of the k-means clustering method. It is an incremental clustering method which starts with one cluster. Iteratively ...
Improvement of the Effective Components in the PDR Positioning Method Based on Detecting the User’s Movement Mode Using Smartphone Sensors
The purpose of this paper is to evaluate and improve the accuracy of indoor positioning using smartphone sensors based on the Pedestrian Dead Reckoning (PDR) method. In some specific situations, such as fires or power outages that disable infrastructure-based positioning techniques, the PDR method, which performs positioning continuously using smartphone sensors, is a good solution. This paper focu...